follow one of several different distribution functions, such as normal,
exponential, binomial (as in logistic regression), or Poisson.
» With LM, the linear combination becomes the predicted value of the outcome,
but with GLM, you can specify a link function. The link function defines the
transformation between the linear combination and the predicted value. As we note in
Chapter 18, logistic regression applies exactly this kind of transformation: Let’s
call the linear combination V. In logistic regression, V is sent through the
logistic function $1 / (1 + e^{-V})$ to convert it into a predicted probability of
having the outcome event. So if you select the correct link function, you can
use GLM to perform logistic regression.
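To make the transformation concrete, here is a minimal Python sketch of sending a linear combination V through the logistic function. The intercept, slope, and predictor value are hypothetical numbers chosen only for illustration; they don't come from any data in this book.

```python
import math


def logistic(v):
    """Convert a linear combination V into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-v))


# Hypothetical regression coefficients and predictor value (illustration only).
intercept = -3.0
slope = 0.5
x = 4.0

v = intercept + slope * x     # the linear combination V, here -1.0
probability = logistic(v)     # predicted probability of having the outcome event

print(f"V = {v:.2f}, predicted probability = {probability:.3f}")
# prints: V = -1.00, predicted probability = 0.269
```

No matter how large or small V gets, the result stays between 0 and 1, which is what makes it usable as a probability.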
GLM is the Swiss army knife of regression. If you select the correct distribution and link function,
you can use it to do ordinary least-squares regression, logistic regression, Poisson
regression, and a whole lot more. Most statistical software offers a GLM function;
that way, other specialized regressions don’t need to be programmed. If the soft-
ware you are using doesn’t offer logistic or Poisson regression, check to see
whether it offers GLM, and if it does, use that instead. (Flip to Chapter 4 for an
introduction to statistical software.)
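The following Python sketch illustrates the Swiss army knife idea: the same linear combination V produces a different kind of prediction depending on which settings you choose. The pairings shown are the standard ones (normal outcome with no transformation, binomial outcome with the logistic function, Poisson outcome with the exponential function). The names GLM_SETTINGS and predict, and the coefficient values, are hypothetical and exist only for this illustration. The sketch doesn't fit a model; it shows only the prediction step.

```python
import math

# Regression type -> (outcome distribution, transformation applied to V).
# These are the standard pairings, listed here for illustration.
GLM_SETTINGS = {
    "least-squares": ("normal", lambda v: v),                        # V itself is the prediction
    "logistic": ("binomial", lambda v: 1.0 / (1.0 + math.exp(-v))),  # a probability between 0 and 1
    "poisson": ("Poisson", math.exp),                                # an expected count, always positive
}


def predict(kind, intercept, slope, x):
    """Turn one predictor value into a predicted outcome for the chosen regression type."""
    _distribution, transform = GLM_SETTINGS[kind]
    v = intercept + slope * x   # the linear combination V
    return transform(v)


# Hypothetical coefficients and predictor value (illustration only).
for kind, (distribution, _transform) in GLM_SETTINGS.items():
    prediction = predict(kind, intercept=0.2, slope=0.1, x=3.0)
    print(f"{kind:14s} {distribution:9s} prediction = {prediction:.4f}")
```

In real statistical software, you typically make the same choice by naming a distribution family (and sometimes a link function) when you call the GLM function, and the software handles the fitting.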
Running a Poisson regression
Suppose that you want to study the number of fatal highway accidents per year in
a city. Table 19-1 shows some made-up fatal-accident data over the course of
12 years. Figure 19-1 shows a graph of this data, created using the R statistical
software package.
Running a Poisson regression is similar in many ways to running the other com-
mon kinds of regression, but there are some differences. Here are the steps:
1. As with any regression, prepare your predictor and outcome variables in
your data.
For this example, you have a row of data for each year, so year is the experimental
unit. For each row, you have a column containing the outcome value,
which is the number of accidents that year (Accidents). Because you have one
predictor, which is year, you also have a column for Year.
2. Tell the software which variables are the predictor variables, and which
one is the outcome.